Genomic diversity in Fructobacillus spp. isolated from fructose-rich niches

The Fructobacillus genus is a group of obligately fructophilic lactic acid bacteria (FLAB) that requires the use of fructose or another electron acceptor for their growth. In this work, we performed a comparative genomic analysis within the genus Fructobacillus by using 24 available genomes to evaluate genomic and metabolic differences among these organisms. In the genome of these strains, which varies between 1.15- and 1.75-Mbp, nineteen intact prophage regions, and seven complete CRISPR-Cas type II systems were found. Phylogenetic analyses located the studied genomes in two different clades. A pangenome analysis and a functional classification of their genes revealed that genomes of the first clade presented fewer genes involved in the synthesis of amino acids and other nitrogen compounds. Moreover, the presence of genes strictly related to the use of fructose and electron acceptors was variable within the genus, although these variations were not always related to the phylogeny.


Introduction
The genus Fructobacillus is a group of rod-shaped heterofermentative lactic acid bacteria (LAB) that was described just over a decade ago by Endo and Okada [1]. Initially, this genus was included in the Leuconostocaceae family but in 2020, due to its phylogenetic position and the morphological and biochemical characteristics of its members, it was placed into the Lactobacillaceae family [2,3]. Up to date, the genus Fructobacillus is composed of eleven species: F. durionis, F. fructosus (type species), F. ficulneus, F. pseudoficulneus, F. tropaeoli, F. papyriferae, F. papyrifericola, F. broussonetiae, F. parabroussonetiae, F. cardui and F. apis. The Fructobacillus species are classified as obligatory fructophilic lactic acid bacteria (FLAB) [4], as a result of their preference for D-fructose over D-glucose as growth substrate, related to their requirement for an electron acceptor (oxygen, pyruvate, or fructose) during glucose dissimilation. All members of this genus produce equimolar amounts of lactic acid and acetic acid and a small amount of ethanol as main end-products [2,4,5]. The Fructobacillus species only ferment a limited number of carbohydrates, mainly D-glucose and D-fructose; and some species are known as osmotolerant [1,4,6]. These organisms have been found in flowers, fruits, and insects associated with these environments or related fermented products; all niches linked to high fructose content [7][8][9][10][11][12][13][14].
Due to their unique features, and dominance and adaptation to specific environments, some Fructobacillus species have been studied to assess their technological potential. Since Fructobacillus and other FLAB organisms have been found in environments associated with bees (high fructose consumer insects) [8,10,12,[15][16][17], several studies have reported the potential of Fructobacillus and its by-products to improve the health of honey bees [12,18]. These findings are highly relevant given that bees are pollinators par excellence in nature, and they are declining worldwide. In addition, the genus Fructobacillus deserves a marked interest for its potential application in the food industry. The relevance of some species in spontaneous food fermentation was evidenced in some processes due to their dominance, such as in Tempoyak and cocoa bean fermentation [19][20][21]. Fructobacillus organisms are able to metabolize fructose preferentially and colonize unusual niches, features of great interest for their exploitation in food fermentation [22]. A reduction of fructose content in food is desirable since a high intake of this sugar contributes to multiple health consequences, such as insulin resistance, obesity, liver disorders, diabetes and high blood pressure [23,24]. Fructobacillus organisms can consume fructose by two pathways: i) as energy substrate (through the phosphoketolase pathway), producing lactic and acetic acids as main fermentation products, and ii) as electron acceptor, reducing this sugar to mannitol [22,25]. Mannitol is a naturally occurring polyol that is mainly employed as a low-calorie sweetener in food manufacturing. Due to its zero glycemic and insulinemic index, it is suitable as food constituent in people suffering diabetes [26]. Moreover, it also contributes to increase shelf-life of food by reducing the crystallization tendency of sugars [27]. In this regard, Fructobacillus organisms are efficient mannitol producers, as previously observed in recent studies [28][29][30]. High amounts (82 g/L) of high-quality mannitol from fructose were obtained under optimized conditions with F. tropaeoli CRL 2034 [29]. Conversion efficiency of mannitol by this strain is one of the highest reported for a LAB strain up to date [22,31,32].
FLAB are able to extend the shelf life and increase the antioxidant level of food [4]. Acetic acid, mandatorily produced by Fructobacillus organisms to counteract the deficiency in ethanol synthesis, exhibits inhibitory effects against certain food spoilage-associated microorganisms [33][34][35][36]. FLAB are also capable of modifying plant secondary metabolites during plant fermentation, enhancing the functional and nutritional properties of plant-based products [22]. For instance, F. fructosus strains have been shown to convert p-coumaric and caffeic acid to phenolic acid derivatives with higher biological activities than their precursors [15]. Additionally, some Fructobacillus strains can reduce the fermentable oligosaccharides, disaccharides, monosaccharides, and polyols (FODMAPs) present in food, reducing the risk of the onset of irritable bowel syndrome (IBS) symptoms and other functional gut disorders [37].
Advances in sequencing technologies and the constant development of new or more powerful bioinformatics tools have led to a breakthrough in genomic studies. In this context, the number of available microbial genomes is constantly increasing. Due to the relatively new characterization of the genus Fructobacillus, a limited number of genomes have been described. Only a few genomic studies on Fructobacillus genomes have been published but they have been enough to reveal that these bacteria have adapted to their specific niches through reductive evolution [25,38]. Endo et al. [38] performed a comparative genomic analysis between the draft genomes of five Fructobacillus spp. and nine Leuconostoc spp strains. The results showed that Fructobacillus spp. had a smaller genome size (1.49 ± 0.30 Mbp), higher G+C content (� 44%) and fewer protein-coding sequences (CDSs) than Leuconostoc spp. Furthermore, these authors concluded that Fructobacillus showed a reduction in the number of genes involved in carbohydrate transport and metabolism (5.1% in Fructobacillus vs 8.8% in Leuconostoc) as well as the number of genes related to energy production and conversion, suggesting the existence of simpler energy systems. These genomic analyses showed that Fructobacillus strains possess none or at most one gene for the phosphotransferase system (PTS), which is a major transport system in LAB. This niche-specific reductive adaptation, also observed in other FLAB such as Apilactobacillus kunkeei, A. apinorum, and Fructilactobacillus florum [39][40][41] would be a way to simplify cell metabolism considering nutrient availability in fructose-rich niches [22]. In addition, Fructobacillus spp. were reported to be the first heterofermentative LAB to lack the adhE gene, which encodes a bifunctional alcohol-acetaldehyde dehydrogenase [38]. The absence of adhE does not allow this genus to regenerate NAD + by converting acetyl-CoA to ethanol. Therefore, NAD + is regenerated through the conversion of fructose to mannitol, step catalyzed by the mannitol 2-dehydrogenase enzyme (MDH) [2].
Although the genomic properties of some Fructobacillus spp. were compared to closely related genus genomes [38], differences within the Fructobacillus genus have not been studied in detail yet. In addition, genomes of F. papyriferae, F. papyrifericola, F. broussonetiae and F. parabroussonetiae, F. cardui and F. apis were not previously used in a comparative analysis. In this study, we deepened the knowledge on the bacterial metabolism of all Fructobacillus members from a genomic viewpoint, which may provide relevant information of their biotechnological potential. Thus, a comparative genomic analysis of the genus Fructobacillus using all available genomes to the time of writing this article was performed to investigate its pangenome, characterize its mobilome, and compare some metabolic pathways within the group.

Bacterial genomes and DNA extraction
In this study, all available Fructobacillus genomes to date (December 2022) were used. Twentytwo available online sequences were retrieved from the National Center for Biotechnology Information (NCBI) database. Genome GenBank accession numbers are listed in Table 1. The draft genome of the mannitol-producer F. tropaeoli CRL 2034, previously sequenced and characterized by our team [42], was included in this comparative analysis. Furthermore, the genome of the strain Fructobacillus sp. CRL 2054, isolated from ripe fig fruit (26.8241405 S 65. 2226028 W) in Tucumán, Argentina [30], was sequenced and included in this study.
DNA extraction from the Fructobacillus sp. CRL 2054 strain was done using cells from a pure culture single colony; then, cells were washed and inoculated in FYP broth [6] with 20 g/L of fructose and 10 g/L of glucose at 30˚C without shaking. Before the cells reached the stationary phase, they were centrifuged at 4000 x g for 10 min, and the cell pellet was resuspended in 500 μL of a cryopreservative liquid provided by the sequencing company. The resuspended cells were transferred into tubes with solid phase reversible immobilization (SPRI) beads, mixing by inversion 10 times. Beads were washed with extraction buffer containing lysozyme and RNase A, and incubated at 37˚C for 25 min. Proteinase K and RNaseA were added and incubated at 65˚C for 5 min. Genomic DNA was purified using an equal volume of beads and resuspended in EB buffer (10 mM Tris-Cl, pH 8.5). DNA was quantified in triplicate with the Quantit dsDNA HS assay (Thermo Fisher Scientific, Indianapolis, USA) in an Eppendorf AF2200 plate reader (Eppendorf, Mississauga, Canada).

Genome sequencing and de novo assembly
The genomic DNA of the strain CRL 2054 was sequenced by using a whole genome shotgun (WGS) strategy by MicrobesNG (https://microbesng.com). Genomic DNA libraries were prepared using Nextera XT Library Prep Kit (Illumina, San Diego, USA), following the manufacturer's protocol with few modifications: two nanograms of DNA were used as input, and PCR  [43]. The quality was assessed using in-house scripts combined with SAMTools, BEDTools and the BWA-MEM software. De novo assembly of reads was performed using SPAdes version 3.7 [44].

Characterization and functional annotation of genomes
The assembly metrics and GC content (%) of the studied genomes were determined with QUAST [45]. Completeness and contamination percentages in all genomes were calculated with CheckM [46] using a set of marker genes for organisms belonging to the order Lactobacillales. A search for specific genes related with horizontal transfer elements, antimicrobial functions, and antibiotic resistance was also performed in these genomes (https://crisprcas.i2bc. paris-saclay.fr/CrisprCasFinder/Index; https://phaster.ca; https://isfinder.biotoul.fr; http:// bagel4.molgenrug.nl; https://cge.cbs.dtu.dk/services/ResFinder-4.1). CRISPRCasFinder [47] and Phaster tool [48] were used to find and characterize putative CRISPR-Cas systems and prophage regions, respectively. ISfinder database [49] was used to identify genes related to insertion sequence (IS) elements. Moreover, the identification of bacteriocin-coding sequences was performed with BAGEL4 [50], and antibiotic resistance genes were found by comparison against ARG-ANNOT [51] and ResFinder [52] databases. To keep uniformity in the analysis, the prediction of coding sequences (CDS) and functional annotation of genes in all genomes was done using Prokka [53]. Genomes were also annotated in the RAST server and its SEEDViewer tool was used to confirm the presence or absence of some genes of interest and their genomic context. To search for metabolic differences in the studied Fructobacillus organisms, the genes of the studied genomes were grouped into metabolic categories by comparison with the KEGG (Kyoto Encyclopedia of Genes and Genomes) and COG (Cluster of Orthologous Genes) databases, using the tools BlastKOALA [54], and eggNOG mapper [55], respectively. Furthermore, the dbCAN2 tool [56] was also used for the screening of genes related to sucrose metabolism and biosynthesis of exopolysaccharides. The obtained data and additional information from KEGG database were used to make a comparative analysis of the presence or absence of genes involved in the central metabolism of these bacteria.

Phylogenetic and pangenome analyses
Three phylogenetic analyses were performed on Fructobacillus using different approaches. Initially, a phylogenetic tree was obtained based on the 16S rRNA sequences of Fructobacillus and related organisms. Metagenome-assembled genomes (Fructobacillus genomes MAG1 to MAG4), were excluded from this analysis, since the sequence of the 16rRNA gene was not found in these genomes. Sequences were aligned with ClustalW; poorly aligned regions were manually trimmed. The tree was obtained applying the Maximum-likelihood method with IQTREE [57]. The TIM3+F+R3 substitution model, previously determined as the best-fit substitution model for this dataset was used for tree inference. The root was fixed using the sequence of Lactobacillus delbrueckii subsp. delbrueckii BCRC 12195 as an outgroup member. Additionally, the identity percentages among the available 16S-rRNA sequences of Fructobacillus organisms were calculated performing a global alignment in CLUSTALOmega [58].
A second tree was designed using single-copy core genes present in all Fructobacillus organisms and Leuconostoc mesenteroides ATCC 8293 T (used as outgroup). Genes were aligned using MAFFT with default parameters and then concatenated. Poorly aligned regions were removed with Gblocks [59]. The best evolutionary model was determined for each codon position (1 st , 2 nd , or 3 rd ), and the tree was inferred with IQTREE [57] applying the Maximum-likelihood method. In both trees, the robustness of the branches was measured by ultrafast bootstrapping (UFB) of 10,000 replicates.
Finally, a third phylogenetic approach was performed to find the rate of recombinant sites respect to mutations in the Fructobacillus core genome. To this end, an initial Maximum-likelihood tree using the core genes of the studied genomes was obtained with IQTREE. Then, an analysis of recombinant sites in the core-genome alignment was done by ClonalFrameML software [60]. The R/θ parameter, known as the ratio of recombination to mutation, was calculated with the EM algorithm by performing 100 simulations.
To estimate the number of core and accessory genes in the studied genomes, groups of orthologous genes were identified through GET_HOMOLOGUES software [61]. The OrthoMCL algorithm was chosen to cluster genes in orthologous groups [62]. The default identity threshold was modified by up to 40% to fit a genus analysis.

Statistical analyses
Wilcoxon's non-parametric test (Mann-Whitney U) was applied to compare the number of genes involved in the COG categories and KEGG modules between the two Fructobacillus groups. Analyses were performed using the InfoStat Statistical Software (Universidad Nacional de Córdoba, Córdoba, Argentina). Furthermore, clustered heatmaps were obtained in all cases by using the Pheatmap R package.

General characteristics of the studied Fructobacillus genomes
Twenty-four genomes of Fructobacillus strains were studied to identify genomic differences within this genus. NCBI accession numbers, along with genome features and assembly statistics are summarized in Table 1.
The genome of Fructobacillus sp. CRL 2054 was successfully sequenced with 99.07x coverage. The resulting draft sequence after the de novo assembly contained 1,326,779 bp divided into 28 contigs higher than 200 bp. The N50 parameter, related to the quality of assembly, was higher than 100 kb.
Only one of the studied genomes presented complete status (Fructobacillus sp. KI3_B9accession number CP097122.1), whereas the rest of the genomic sequences were fragmented into contigs or scaffolds (Table 1), 20 of them presenting less than 50 contigs. No plasmidic DNA was found in any of the studied strains. Despite the draft status of the majority of the studied genomes, all the sequences showed more than 97% completeness and low contamination (below 2%). These data indicate that all genomes present a near-complete status according to Parks et al. [46], and are suitable for the comparative genomic analysis.
In general, a small genome size was observed in the genomes of all

Phylogenetic analyses in Fructobacillus
A phylogenetic tree was made using 16S rRNA sequences of Fructobacillus and related organisms ( Fig 1A). Two clades could be distinguished among the Fructobacillus sequences. The first clade was composed of Fructobacillus sp. CRL 2054, F. durionis DSM 19113 T , F. fructosus strains, F. papyriferae strains, F. papyrifericola M1-21 T , F. broussonetiae M2-14 T , F. parabroussonetiae S1-1 T , F. apis W13 T and Fructobacillus sp. M158, while the second clade consisted of F. tropaeoli strains, F. pseudoficulneus DSM 15468 T , F. ficulneus JCM 12225 T , F. cardui M131 T , Fructobacillus sp. KI3_B9 and Fructobacillus sp. EFB-N1. According to this tree, the 16s rRNA sequence of the clade 1 showed a markedly higher divergence against its common ancestor when compared to the clade 2. In addition, the identity values of the 16S rRNA gene between pairs of strains were used to design a clustered heatmap (Fig 2). Organisms of the same phylogenetic clade were clustered together. The sequences presented more than 96% identity between organisms of the same clade and 94 to 95% identity between organisms of opposite clades, indicating that this gene is highly different between both groups. In the same way, another phylogenetic tree was inferred by the Maximum Likelihood method using 656 core genes present in Fructobacillus strains and L. mesenteroides ATCC 8293 T (used as outgroup) (Fig 1B). The resulting alignment (554,163 bp-long after trimming poorly aligned regions) was used for the tree inference. The two groups of strains already described were also located in different clades in this tree. In addition, F. fructosus MAG1 to MAG3 strains were located along with other F. fructosus strains in clade 1, whereas Fructobacillus sp. MAG4 belonged to clade 2. As previously observed in the 16S tree, members of the first clade were also more distant from the common ancestor than organisms belonging to the second clade.
Additionally, the events of homologous recombination in the core genome of the studied genomes were determined through a Maximum-likelihood approach by using the ClonalFra-meML software. This approach allowed to identify 2132 recombinant events among the studied Fructobacillus genomes. The value of the R/θ parameter (ratio of recombination events respect to mutations) was 0.0112 (± SD: 1,43E-03), indicating that mutation events occur at roughly 90 times more than recombination in the core genome of the studied Fructobacillus.

Identification of prophages, CRISPR-Cas9 systems and bacteriocin-and antibiotic resistance-encoding genes
Genomic regions related to horizontal gene transfer (HGT), antimicrobial properties, and antibiotic resistance in the Fructobacillus genomes were identified ( Table 2). Only prophages with PHASTER score higher than 70 (complete or questionable state) were analyzed. At least one prophage region was present in nineteen out of the twenty-four genomes, these regions were present in organisms of both phylogenetic clades described before. Twelve genomes contained one prophage, whilst seven genomes contained two or more prophage regions. Of all 28 prophages identified, nineteen regions were intact (18-55 kb in size), and nine contained partial regions (15-33 kb), which were usually located at the start or end of a contig (S1 Table). All prophages presented similar GC content (34.0-43.4%). Genes coding for a terminase, protease, coat protein, portal protein, tail shaft, and other phage-related proteins represented more than 50% of total genes, while the rest were not associated with any known function. A gene coding for a plate protein was only found in one of the prophages of F. tropaeoli F214-1 T . Furthermore, an attachment site, which is necessary for phage insertion into the bacterial chromosome, was observed in fourteen of the studied prophages. Moreover, different tRNA genes were found in six prophages. A BLASTn search showed certain similarities between nineteen of the twenty-eight Fructobacillus prophages and sequences of other phages, previously found in the metagenome of a honeybee [63]; however, the alignment coverage percentage was low

PLOS ONE
(< 60%) for most of the studied sequences, indicating that most of the Fructobacillus prophages have not been previously described. In most cases, the Fructobacillus prophage sequences (with the exception of one of the prophages of F. tropaeoli F214-1 and F. cardui M131) were similar to sequences of the Siphoviridae family viruses (S1 Table).
CRISPR-Cas systems, widely distributed in bacteria to provide immunity against foreign DNA, were found in seven out of the twenty-four studied genomes ( Table 2). Two of the Fructobacillus genomes harboring these systems belonged to clade 1 (F. fructosus strains), while the rest of the genomes were part of clade 2. According to the CRISPR-Cas systems classification, all identified regions belonged to type IIa Cas systems, presenting genes coding for Cas9, Cas1, Cas2, and Csn2 proteins. A variable number of spacer sequences (4 to 11) between short palindromic repeats were found downstream of these genes. Genomes of F. fructosus strains and F. tropaeoli RD012353 harbored the highest number of spacers (between 7 and 11). F. ficulneus JCM 12225 also presented two sets of CRISPR repeats without Cas genes that were truncated by a contig start.
Insertion sequence (IS) elements, mobile elements of short length (0.7-2.5 kb) that contain genes coding for transposases, responsible for the insertion of these DNA segments in the bacterial genome [64], have been also sought. Several IS transposases, mainly those belonging to IS3 and IS30 families were widespread in the Fructobacillus genomes. Fructobacillus sp.  EFB-N1 stood out containing 11 different transposases, being this value clearly higher than those observed in the other genomes, which contained 0 to 5 transposases only. The identification of genes related to antimicrobial activity in Fructobacillus organisms was also performed. Only F. durionis DSM 19113 T harbored two contiguous regions encoding two peptides of a bacteriocin. These genes showed similarity against peptide chain A (59.52% identity) and peptide chain B (51.67% identity) of the bacteriocin LS2, a member of the class IId bacteriocins produced by Ligilactobacillus salivarius BGH01. A gene coding for an ABC transporter (necessary for peptide export) was also found in this strain near to the bacteriocin-coding genes.
The search for antibiotic resistance genes against the ARG-ANNOT and ResFinder databases showed that Fructobacillus genes did not have significant similarities with antibiotic resistance genes present in the aforementioned databases. However, according to the genome annotation previously obtained with Prokka, seven different genes classified as multidrug resistance genes were identified in the pangenome, from which three emrB genes encoding a multidrug exporter and a gene related with the quaternary ammonium-compound resistance were present in all genomes. Furthermore, two genes involved in the resistance to byciclomycin and disinfectants of the family of quaternary ammonium compounds were only detected in the genomic sequence of F. tropaeoli F214-1 T .

Pangenome analysis of Fructobacillus
For this study, genes from the twenty-four studied genomes were clustered into orthologous groups. This analysis resulted in 4,549 gene clusters, which make up the pangenome of this set of Fructobacillus genomes. Out of the total detected groups, 724 genes were present in all genomes (core genome), and 854 genes were found in 22 genomes or more (soft-core genome). Furthermore, 3,695 groups of genes were part of the dispensable genome, of which 1,155 were present in 3 to 21 genomes (shell genome), and 2,540 were each located in one or two strains (cloud or unique genome). Interestingly, the genes present in 3 to 9 genomes constituted an important part of the shell (868 out of 1,155 genes) (Fig 3A).
The results of the pangenome analysis were also used to cluster the studied organisms based on the presence or absence of genes in the dispensable genome. As shown in Fig 3B, organisms were also clustered in two opposite clades, being this division identical to that observed in the phylogenetic analyses. Moreover, a noticeable number of genes were present in members of clade 2 and absent in strains of clade 1. In the same way, a lower number of genes were present in all genomes of the first clade and in none of the second clade genomes. These findings evidence a clear difference in the genetic content among the studied Fructobacillus strains.

Characterization of Fructobacillus genomes through classification in COG categories and KEGG metabolic modules
Genes from all genomes were grouped into COG categories related to cellular processes and signaling, information storage and processing, metabolic processes, and unknown function (Fig 4A). The ratio of genes assigned in each COG category against the total number of genes in all COGs was determined for each strain and shown in Fig 4. Ratio values between clades 1 and 2 were compared for each COG category by performing the Wilcoxon test. The ratios of genes in organisms of clade 1 were significantly lower than in clade 2 in three categories related to metabolism [E (Amino acid transport and metabolism); H (Coenzyme transport and metabolism) and Q (Secondary metabolites biosynthesis, transport, and catabolism)] and in W category (related with extracellular structures). On the contrary, clade 1 presented higher gene ratios than clade 2 in T (Signal transduction mechanisms), V (Defense mechanisms), and J (Translation, ribosomal structure, and biogenesis) categories. As described above, differences in size were observed in the analyzed genomes. In this way, a possible association between the number of genes in each COG category and genome sizes was assessed by calculating the Pearson correlation coefficient (Fig 4B). The highest positive correlation values (Pearson coefficient > 0.88) were found in the metabolism-associated categories E and H (involved in the transport and metabolism of amino acids and coenzymes, respectively), and in categories C (Energy production and conversion), and S (function unknown). These results could indicate a narrow relationship between the genome size of the studied Fructobacillus strains and their number of genes associated with the metabolism of nitrogen compounds and other cellular processes.
The genes of each genome were also classified into metabolic modules using the KEGG mapper-Reconstruct pathway tool from the KEGG database. Complete and almost complete modules were selected for a comparative analysis among the Fructobacillus genomes. Percentages of completeness of each metabolic module in each strain are shown in Fig 5. Data of L. mesenteroides ATCC 8293 T was also included in the figure to compare the results with a related organism. The number of present blocks (genes or groups of genes forming part of one step in the pathway) were statistically compared between clades for each metabolic module. Large differences were distinguished in metabolic pathways related to amino acid biosynthesis. Genomes of the clade 1 presented a significantly lower number of genes (p < 0.05) associated with the biosynthesis of eleven amino acids (serine, cysteine, methionine, ornithine, arginine, histidine, shikimate, threonine, isoleucine, leucine, and tryptophan) when compared with organisms of the clade 2. Both groups also differed in the number of genes involved in the metabolism of cofactors and vitamins; organisms located in the first clade had significantly fewer genes related to the biosynthesis of tetrahydrofolate and pyridoxal-P. Furthermore, a remarkable difference between both groups in the biosynthesis of the inosine monophosphate nucleotide was also observed, where none or only one of the 8 blocks required for this pathway was found in organisms of the clade 1. Furthermore, the level of completeness of each metabolic module was usually similar between genomes of the second clade and L. mesenteroides ATCC 8293 T . These results confirm a lack in the synthesis of several amino acids and some vitamins and nucleotides in organisms belonging to clade 1 with respect to members of the second clade and the type strain of L. mesenteroides.

Analysis of genes involved in the central metabolism in Fructobacillus
The information retrieved from the KEGG database and the results of pangenome studies were used to reconstruct the central metabolic pathway in Fructobacillus organisms, considering the metabolism of carbohydrates and the use of electron acceptors for the maintenance of redox balance. Additionally, a comparative description was made by analyzing differences in genes involved in the central metabolism among the genomes under study (Fig 6A). Genes for the synthesis of lactate, acetate and carbon dioxide were identified. The 6-phosphogluconate/ phosphoketolase pathway, present in heterofermentative LAB, was almost complete in these organisms, with the exception of the bifunctional adhE gene (involved in the reduction of acetyl-CoA to ethanol through acetaldehyde dehydrogenase and alcohol dehydrogenase activities). The lack of this gene is a feature of the fructophilic behavior of these organisms. Nevertheless, three different families of adh genes with alcohol dehydrogenase activity (EC 1.1.1.1), necessary for ethanol synthesis through acetaldehyde reduction, were distributed in 20 Fructobacillus genomes. Genes encoding for acetaldehyde dehydrogenases were not detected.
Differences in the presence/absence or in the copy number of genes among Fructobacillus genomes were observed in 14 out of total 38 genes involved in central metabolism (Fig 6A). Nine of these genes (fk, L-ldh, alsS, pgk, budC, adh, yjlD, fructosyltransferases and glucosyltransferases) were only present in some of the studied genomes, whereas five genes (glcU, pgi, pgl, gpmA and D-ldh) differed in the number of copies (paralogous genes) among Fructobacillus organisms. Furthermore, from the initial fourteen genes presenting differences among genomes, eight of these were strictly related to the use of fructose and electron acceptors. Within this group, six genes (yjlD, adh, D-ldh, L-ldh, alsS, and budC) were associated with the use of electron acceptors and NAD + regeneration, while two genes (fk and gpi) were involved in the use of fructose as a growth substrate. In addition, two genes (the paralogous gene D-ldh2 and the NADH dehydrogenase gene yjlD) were only present in the genomes of the second clade. Noteworthy, the presence or absence of most genes of the central metabolism was not clade-specific. As observed in Fig 6B, organisms of each phylogenetic clade were widely distributed in different clusters according to the presence/absence of central metabolism-genes, indicating that differences observed in genes of the central metabolism are not related with the phylogeny among organisms. For instance, the budC gene, which is involved in the regeneration of NAD(P) + through reduction of diacetyl and acetoin, was only present in two out of the three members of F. tropaeoli species.
Seventeen Fructobacillus genomes harbored genes with specific domains of fructosyltransferases (ftf-GH68 family), which includes levansucrases and other enzymes involved in the synthesis of different types of fructans that use sucrose as their preferential donor substrate. In the same way, only three genomes (F. tropaeoli RD012353, F. broussonetiae M2-14 T , and F.  (Fig 6B).
Pyruvate can be used for NAD(P)H reoxidation through reduction to lactate by lactate dehydrogenases (LDHs), or by synthesis of aroma compounds (diacetyl and acetoin). Two Dlactate dehydrogenase genes, named D-ldh1 and D-ldh2, were identified in Fructobacillus organisms of clade 2. Although genomes of the clade 1 only contained the D-ldh1 gene, the strain Fructobacillus sp. CRL 2054 stood out by harboring two identical copies (100% identity) of D-ldh1 with its ribosomal binding site (RBS). Moreover, most strains contained one copy of a putative L-ldh gene in their genomes; however, F. durionis DSM 19113 T presented two contiguous L-ldh genes while F. broussonetiae M2-14 T did not harbor any copy. Regarding pyruvate reduction through the synthesis of aroma compounds, the alsS gene (related to the conversion of pyruvate into alpha-acetolactate) and the budC gene (responsible for NAD(P)H reoxidation in this pathway) were found in 20 strains and 6 strains, respectively; the presence of budC being strain-specific (Fig 6B). In addition, the required genes for the synthesis of pyruvate for later use as electron acceptor through the assimilation of citrate were present in the core genome of Fructobacillus (Fig 6A).
Oxygen can also be used as an electron acceptor, being reduced to H 2 O with H 2 O 2 as an intermediate. Three NADH oxidase genes were identified in Fructobacillus, and two of them were included in the core genome. However, the yjlD gene coding for a NADH dehydrogenase-like protein was only found in genomes belonging to the second clade.
Fructose can be used by these organisms as an electron acceptor or as a growth substrate. The reduction of fructose to mannitol is performed by the MDH, whereas the fructokinase (FK) and glucose-6-phosphate isomerase (GPI) enzymes are used for the assimilation of fructose in heterofermentative LAB. A schematic representation of the distribution of genes related to the use of fructose is shown in Fig 6C. The MDH gene (mdh) and a putative fructose nonphosphorylating permease gene (fruP) were located contiguously in all Fructobacillus genomes (Fig 6C). On the contrary, two fk genes (fk1 and fk2) and three gpi genes (gpi1, gpi2 and gpi3) were differentially distributed in the studied strains. The fk1 and gpi1 genes were found in different parts of the genomes of all fructobacilli, except for the F. fructosus and F. apis strains (where fk2 and gpi2 were present) and in F. broussonetiae M2-14 T , where no fk gene was identified according to Prokka and RAST annotations. On the other hand, fk2 and gpi2 were present in some organisms (F. fructosus strains, F. apis W13, F. durionis DSM 19113 and Fructobacillus sp. CRL 2054), being contiguously located in these genomes. Another gene called gpi3 was only detected in F. ficulneus JCM 12225 T (Fig 6C).

Discussion
The genus Fructobacillus has been described as a group of FLAB microorganisms formerly belonging to the genus Leuconostoc that suffered a reduction in their genomes as a consequence of their adaptation to fructose-rich niches. Fructobacillus genomes present fewer genes 6- involved in the metabolism of carbohydrates than other LAB, especially due to the lack of transporters of the phosphotransferase systems (PTS) [38]. Despite these previous findings, more information on the genomic properties of Fructobacillus is needed to improve the knowledge on multilevel system regulation and provide a better understanding of their behavior [22]. Our study is the first comparative genomic analysis of the Fructobacillus genus that includes genomes of the recently described species after February 2022 (papyriferae, papyrifericola, broussonetiae, parabroussonetiae, cardui, and apis species). Phylogenetic relationships were reconstructed using the 16S rRNA as well as the concatenated DNA sequences of 656 core Fructobacillus genes. These phylogenetic trees allowed the clear identification of two wellsupported clades, as previously reported [1,11,38]. Several differences related to sequence similarity, general genomic properties, and gene content between the two clades were observed and further characterized. Regarding sequence similarity, the 16S rRNA identity showed lower values in genomes of opposite clades than those from the same group. Differences in the identity of the 16S ribosomal gene were around 6% for genomes of different groups, this value being the maximum allowed for organisms of the same genus [65]. A similar division of genomes in two clades was observed between the phylogenetic analyses and the clustering based on presence/absence of genes. No plasmids were found in the studied genomes, although lysogeny was widespread and several classes of mobile elements were present in the twenty-four Fructobacillus strains. However, the calculated R/θ values (ratio of recombination events respect to mutations) showed that the rate of homologous recombination events in the Fructobacillus core genome was around 90-fold lower than mutation events, indicating that HGT did not meaningfully affect the genetic content of each clade.
Regarding general genomic properties, most genomes of the clade 1 had smaller sizes than most genomes of the second clade, indicating genome reduction and the consequent decrease in the total number of genes in organisms of the first group. This latter feature was studied in depth by performing a comparative gene functional analysis based on the COG and KEGG databases. Results suggest a reductive evolution in organisms of clade 1, with fewer genes mainly involved in the metabolism of nitrogen compounds; more specifically, in the synthesis of several amino acids, coenzymes and some nucleotides. Several theories have attempted to explain reductive evolution, especially in insect endosymbiotic bacteria and in free-living marine cyanobacterial populations [66]. Of these hypotheses, a higher mutation rate appears to be a key factor for genome reduction in various prokaryotic lineages [67]. According to this theory, organisms with a high mutation rate (called "mutators") can rapidly acquire beneficial mutations as a mechanism of adaptation to environmental changes. Such increases can also lead to further gene loss of dispensable functions. In insect-associated bacteria (such as A. kunkeei and F. fructosus) vertical transmission to insect offspring causes bottlenecks in their population structure, which leads to the fixation of deleterious mutations [67,68]. In this work, the calculated R/θ parameter indicated that in Fructobacillus mutation events occur more often (90X) than recombination. Furthermore, phylogenetic analyses based on different targets (16S rRNA and core genes) showed a greater genetic distance of clade 1 from the common ancestor of Fructobacillus than clade 2, indicating more mutations accumulated in the first group that could be related to the reduction of their genomes. A noticeable mutational bias towards deletions was observed in bacteria in comparison to eukaryotes [69,70]. Therefore, mutations in group 1 would have triggered the loss of several genes for the synthesis of nitrogenous compounds since these are usually available in fructophilic niches (such as fruits, flowers, and the gastrointestinal tract of honeybees). Amino acids and vitamins are usually present in fruits and other plant structures [71]. In the case of the intestinal microbiota of bees, these compounds can be provided by other bacterial members of the same habitat that possess the machinery for the synthesis of all amino acids, such as Gilliamella spp. and Snodgrassella spp. [72]. However, a more exhaustive study with a higher number of Fructobacillus genomes is necessary to confirm a reductive evolution affecting the nitrogen compounds biosynthesis in part of this genus.
Other authors have already described a decrease in the number of genes of the metabolism of nitrogenous compounds in other LAB. An earlier genomic study on Apilactobacillus kunkeei and Fructilactobacillus sanfranciscensis revealed significant gene loss in these species compared to other taxa. From the total lost genes with assigned functions, 22% affected amino acid metabolism, in particular, amino acid biosynthesis [68]. According to these authors, the results are consistent with a shift to a nutritionally-rich growth habitat, such as the gastrointestinal tract of honeybees. In addition, functional differences in gene clusters for proline, tryptophan, leucine, and arginine biosynthesis were observed among A. kunkeei strains, while genes for purine and pyrimidine biosynthesis were lost in one of the studied strains. The identified biosynthetic gene clusters were located in the same genomic regions in all strains, indicating independent losses [68]. Recently, Maeno et al. [73] suggested that amino acid and carbohydrate metabolism/requirement is highly variable among species of the family Lactobacillaceae (including the genus Fructobacillus). They observed a relatively large gradient (61.6) between the number of genes assigned to the COG class E (amino acid transport and metabolism) and the genome size of 174 Lactobacillaceae strains. On the other hand, despite the genomic differences observed in the nitrogen compound-biosynthesis in Fructobacillus, no clear correlation was found between the identified phylogenetic groups in this genus and the reported habitats for each species, recently reviewed by Filannino et al. [22]. Further studies are needed to elucidate if differences in nitrogen metabolism within Fructobacillus genus would impact in the ability to ferment matrices with low content of nitrogen compounds.
Other differences in gene content were found among Fructobacillus genomes, particularly in the central metabolism. However, most genes showed a clade-independent scattered pattern among strains. These genes were mainly focused on steps of NAD(P) + recycling and catabolism of fructose, which indicates the importance of these processes in fructophilic LAB. Duplication events were already observed in ldh genes, as previously described by Bleckwedel et al. [74]. These authors observed high sequence similarity between paralogous genes ldh1 and ldh2 (74% identity), being ldh1 the main responsible of the LDH activity in F. tropaeoli CRL 2034. An identical duplication (100% identity) of the ldh1 gene and its RBS was also detected in Fructobacillus sp. CRL 2054, indicating a recent duplication event in this organism. This observation shows again the importance of duplication of ldh genes in Fructobacillus, although the reason of this phenomenon remains unknown. Bleckwedel et al. [74] found differences in promoter sequences for each gene and suggested a differential expression regulated by specific environmental conditions; nonetheless, the quantification of D-LDH transcripts is needed to confirm this hypothesis.
Regarding other genes related with the central metabolism, a gene involved in the use of oxygen for NAD(P)H reoxidation (yjlD) was strictly present in clade 2 only. NADH dehydrogenases are a key component of the respiratory chain, but no other gene used for the quinone pool was found in Fructobacillus ( [38] and this study), suggesting that yjlD is involved in the oxidation of NAD(P)H under the presence of oxygen. In addition, genes for the synthesis of exopolysaccharides (EPS) were widely spread among the studied strains. Genes encoding putative levansucrases and other fructosyltransferases were found in several Fructobacillus genomes, as previously observed by Endo et al. [38]. In addition, the detection of genes with domains for glycosyltransferases was reported in this work for the first time for Fructobacillus spp. Although Endo et al. [38] failed to detect EPS production in some Fructobacillus type strains, Tahir et al. [75] found two EPS-producer F. fructosus strains able to synthesize levan and dextran.
Optimization of mannitol production by bacteria is of high interest since its technological relevance in food industry [76]. It is known that Fructobacillus organisms are efficient mannitol producers [29] due to its requirement of fructose as electron acceptor [25]. The genes responsible for the mannitol synthesis and its entrance into the cell (mdh and fruP, respectively) were present in all analyzed genomes, indicating the importance of these genes for the fructophilic metabolism in Fructobacillus. However, FK and GPI enzymes can compete against MDH for fructose by phosphorylating this sugar and channeling it into the phosphoketolase pathway, respectively. Affinity of fructokinases for fructose is higher than that of mannitol dehydrogenase, this being a hurdle in mannitol production [77,78]. In this work, remarkable differences were detected in fk and gpi genes among the analyzed genomes. It would be of high relevance to deeply characterize these differences in fk and gpi genes and its regulation mechanisms. Helanto et al. [79] improved the yield of mannitol from fructose by inactivation of the fk gene in L. pseudomesenteroides. These authors also observed a higher rate of fructose consumption when fk was inactivated. According to these results, a low FK activity would be desirable to enhance mannitol biosynthesis and guarantee a rapid decrease of the fructose content in food fermentation. Surprisingly, the deletion of the FK activity was not totally achieved in the fk mutant designed by [79], despite no fk transcript being detected in this strain. A possible reason could implicate other uncharacterized enzymes with FK activity. This hypothesis could explain the absence of an fk gene in F. broussonetiae M2-14 T reported in the present work, and the ability of this strain to grow in fructose as a sole carbon source [11]. Further studies related to the downregulation of FK and GPI activity will allow exploiting the use of these bacteria as mannitol producers in the fermentation of fructose-rich matrices.

Conclusions
A comparative genomic analysis of the currently available Fructobacillus genomes was performed. The results of this study allowed us to distinguish two phylogenetic groups, where the organisms of clade 1 showed a simplified machinery for the biosynthesis of nitrogen compounds. Furthermore, differences were also identified in the presence of mobile elements and in genes with an essential function in the use of fructose and electron acceptors among the studied strains, being these differences clade-independent. These findings contribute to a better understanding of the unusual metabolism of these organisms that may be exploited for future biotechnological applications.
Supporting information S1